Naive Mixes for Word Sense Disambiguation

Author

  • Ted Pedersen
Abstract

The Naive Mix is a new supervised learning algorithm based on sequential model selection. The usual objective of model selection is to find a single probabilistic model that adequately characterizes, i.e. fits, the data in a training sample. The Naive Mix combines models discarded during the selection process with the best-fitting model to form an averaged probabilistic model. This is shown to improve classification accuracy when applied to the problem of determining the meaning of an ambiguous word in a sentence.

A probabilistic model consists of a parametric form and parameter estimates. The form of a model describes the interactions between the features of a sentence containing an ambiguous word, while the parameter estimates give the probability of observing each possible combination of feature values in a sentence. The class of models in a Naive Mix is restricted to decomposable log-linear models to reduce the model search space and simplify parameter estimation. The form of a decomposable model can be represented by an undirected graph whose nodes represent features and whose edges represent the interactions between features. The parameter estimates are the product of the marginal distributions of the maximal cliques in the graph of the model.

Model selection integrates a search strategy with an evaluation criterion. The search strategy determines which decomposable models are evaluated during the selection process. The evaluation criterion measures the fit of each model to the training sample. (Pedersen, Bruce, & Wiebe 1997) report that the strategy of forward sequential search (FSS) and evaluation by Akaike's Information Criterion (AIC) selects models that serve as accurate classifiers for word-sense disambiguation. Here, this combination is shown to result in Naive Mixes that improve the accuracy of disambiguation over single selected models.

Model selection guided by FSS evaluates the fit of decomposable models at increasing levels of complexity, where complexity is defined as the number of edges in the graph of the model. The best-fitting model of complexity level i is designated the current model, m_i. The models evaluated at complexity level i+1 are generated by adding one edge to m_i and checking that the resultant model is decomposable. The evaluation begins with the model of independence, where there are no interactions between features (i = 0), and ends when none of the generated models of complexity level i+1 sufficiently improves on the fit of m_i. The result is a sequence of …
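
The shape of the procedure can be sketched in a few dozen lines of Python. The sketch below is a simplification rather than the paper's implementation: instead of the full family of decomposable log-linear models, each candidate model only lets selected binary features depend on the sense (a Naive-Bayes-style restriction), forward sequential search adds one such dependency at a time, AIC (minus twice the log-likelihood plus twice the number of free parameters) decides whether the larger model is kept, and every intermediate best-fitting model is retained so that its sense distribution can be averaged into the Naive Mix. The data, feature indicators, and senses are invented for illustration, and counts are smoothed by adding one.

import math
from collections import Counter, defaultdict

# Toy training sample: each instance is (binary feature vector, sense).
# The features are hypothetical indicators for words co-occurring with "line".
DATA = [
    ((1, 0, 1), "cord"), ((1, 0, 0), "cord"), ((1, 0, 0), "cord"), ((1, 0, 1), "cord"),
    ((0, 1, 1), "text"), ((0, 1, 0), "text"), ((0, 1, 1), "text"), ((0, 1, 1), "text"),
]
N_FEATURES = 3

def fit(data, feats):
    # Sense priors, conditional counts for selected features, marginal counts otherwise.
    priors = Counter(s for _, s in data)
    cond = defaultdict(Counter)          # (feature index, sense) -> value counts
    marg = defaultdict(Counter)          # feature index -> value counts
    for x, s in data:
        for f in range(N_FEATURES):
            if f in feats:
                cond[(f, s)][x[f]] += 1
            else:
                marg[f][x[f]] += 1
    return priors, cond, marg, set(feats)

def log_joint(model, x, s):
    # Smoothed log P(s, x): selected features depend on the sense, the rest do not.
    priors, cond, marg, feats = model
    n = sum(priors.values())
    lp = math.log(priors[s] / n)
    for f in range(N_FEATURES):
        c = cond[(f, s)] if f in feats else marg[f]
        lp += math.log((c[x[f]] + 1) / (sum(c.values()) + 2))   # binary features
    return lp

def aic(model, data):
    # AIC = -2 * log-likelihood + 2 * number of free parameters.
    priors, _, _, feats = model
    loglik = sum(log_joint(model, x, s) for x, s in data)
    k = (len(priors) - 1) + len(feats) * len(priors) + (N_FEATURES - len(feats))
    return -2 * loglik + 2 * k

def naive_mix(data):
    # Forward sequential search; retain every intermediate best-fitting model.
    selected, remaining = [], set(range(N_FEATURES))
    sequence = [fit(data, [])]           # start from the model of independence
    best = aic(sequence[-1], data)
    while remaining:
        scored = [(aic(fit(data, selected + [f]), data), f) for f in remaining]
        score, f = min(scored)
        if score >= best:                # no candidate sufficiently improves the fit
            break
        best, selected = score, selected + [f]
        remaining.discard(f)
        sequence.append(fit(data, selected))
    return sequence

def classify(sequence, senses, x):
    # Average the normalised sense distributions of every model in the sequence.
    averaged = Counter()
    for model in sequence:
        scores = {s: math.exp(log_joint(model, x, s)) for s in senses}
        z = sum(scores.values())
        for s in senses:
            averaged[s] += scores[s] / z
    return averaged.most_common(1)[0][0]

mix = naive_mix(DATA)
# With this toy data the mix holds three models and predicts "cord".
print(len(mix), classify(mix, {s for _, s in DATA}, (1, 0, 1)))

In the paper's formulation the candidate models grow by adding edges to an undirected graph over all features and the parameter estimates are products of clique marginals; the sketch keeps only the sequential-selection-then-average structure.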

Similar resources

A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation

This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves acc...

Applying a Naive Bayes Similarity Measure to Word Sense Disambiguation

We replace the overlap mechanism of the Lesk algorithm with a simple, general-purpose Naive Bayes model that measures many-to-many association between two sets of random variables. Even with simple probability estimates such as maximum likelihood, the model gains significant improvement over the Lesk algorithm on word sense disambiguation tasks. With additional lexical knowledge from WordNet, pe...
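
As a rough illustration of the shift from overlap counting to a probabilistic association score, the sketch below compares a context against sense glosses with a smoothed unigram likelihood; it is a generic stand-in rather than the paper's exact model, and the glosses, context, and sense labels are hypothetical.

import math
from collections import Counter

def nb_score(context_words, gloss_words, vocab_size, alpha=1.0):
    # log P(context | gloss) under a unigram model with add-alpha smoothing.
    counts = Counter(gloss_words)
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + alpha) / (total + alpha * vocab_size))
        for w in context_words
    )

# Hypothetical glosses for two senses of "bank" and an ambiguous context.
glosses = {
    "bank/financial": "institution that accepts deposits and lends money".split(),
    "bank/river":     "sloping land beside a body of water".split(),
}
context = "she deposits her money at the bank".split()
vocab = {w for g in glosses.values() for w in g} | set(context)

best = max(glosses, key=lambda s: nb_score(context, glosses[s], len(vocab)))
print(best)   # expected: "bank/financial"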

Exemplar-Based Word Sense Disambiguation: Some Recent Improvements

In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of k, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best k, we have obtained improved disambiguation ...
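
A bare-bones sketch of the mechanism being tuned is given below; it does not reproduce the paper's exemplar-based learner with its richer features and distance metric, only the idea of k-nearest-neighbour classification with k chosen by cross-validation over the training examples. The candidate values of k and the fold count are illustrative.

from collections import Counter

def knn_predict(train, x, k):
    # Majority sense among the k training examples closest in Hamming distance.
    dist = lambda a, b: sum(u != v for u, v in zip(a, b))
    nearest = sorted(train, key=lambda ex: dist(ex[0], x))[:k]
    return Counter(s for _, s in nearest).most_common(1)[0][0]

def pick_k(train, candidates=(1, 3, 5, 7), folds=10):
    # Return the candidate k with the best held-out accuracy across the folds.
    def accuracy(k):
        hits = 0
        for i in range(folds):
            held = train[i::folds]
            rest = [ex for j, ex in enumerate(train) if j % folds != i]
            hits += sum(knn_predict(rest, x, k) == s for x, s in held)
        return hits / len(train)
    return max(candidates, key=accuracy)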

Raw Corpus Word Sense Disambiguation

A wide range of approaches have been applied to word sense disambiguation. However, most require manually crafted knowledge such as annotated text, machine-readable dictionaries or thesauri, semantic networks, or aligned bilingual corpora. The reliance on these knowledge sources limits portability since they generally exist only for selected domains and languages. This poster presents a corpus-b...

An Ensemble Approach to Corpus Based Word Sense Disambiguation

This paper presents a corpus-based approach to word sense disambiguation that combines a number of Naive Bayesian classifiers into an ensemble that performs disambiguation via a majority vote. Each of the member classifiers is based on collocation and co-occurrence features found in varying sized windows of context. This approach is motivated by the observation that, in general, enhancing the featu...
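
The voting scheme can be sketched as follows, assuming simple bag-of-words Naive Bayes members and illustrative window sizes; the paper's member classifiers use collocation and co-occurrence features that are richer than this plain unigram model.

import math
from collections import Counter, defaultdict

def train_nb(instances):
    # instances: list of (list of context words, sense); smoothed count tables.
    priors, cond, vocab = Counter(), defaultdict(Counter), set()
    for words, sense in instances:
        priors[sense] += 1
        for w in words:
            cond[sense][w] += 1
            vocab.add(w)
    return priors, cond, vocab

def nb_classify(model, words):
    priors, cond, vocab = model
    n = sum(priors.values())
    def score(s):
        total = sum(cond[s].values())
        return math.log(priors[s] / n) + sum(
            math.log((cond[s][w] + 1) / (total + len(vocab))) for w in words)
    return max(priors, key=score)

def windowed(data, w):
    # Keep only the w words on each side of the ambiguous word.
    # data: list of (left-context words, right-context words, sense).
    return [(left[-w:] + right[:w], sense) for left, right, sense in data]

def ensemble_classify(data, left, right, windows=(2, 10, 50)):
    # One member per window size; the ensemble answers with a majority vote.
    votes = Counter()
    for w in windows:
        member = train_nb(windowed(data, w))
        votes[nb_classify(member, left[-w:] + right[:w])] += 1
    return votes.most_common(1)[0][0]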

Selectional Preference Based Verb Sense Disambiguation Using WordNet

Selectional preferences are a source of linguistic information commonly applied to the task of Word Sense Disambiguation (WSD). To date, WSD systems using selectional preferences as the main disambiguation mechanism have achieved limited success. One possible reason for this limitation is the limited number of semantic roles used in the construction of selectional preferences. This study invest...

Publication date: 1997